Abstract. Given a system of equations in a "random" finitely
generated subgroup of the braid group, we show how to find a small 
ordered list of elements in the subgroup, which contains a solution 
to the equations with a significant probability. Moreover, with a 
significant probability, the solution will be the first in the list. This 
gives a probabilistic solution to: The conjugacy problem, the group 
membership problem, the shortest presentation of an element, and 
other combinatorial group-theoretic problems in random subgroups 
of the braid group. 

We use a memory-based extension of the standard length-based 
approach, which in principle can be applied to any group admitting 
an efficient, reasonably behaving length function. 



1. The general method 

1.1. Systems of equations in a group. Fix a group $G$. A pure
equation in $G$ with variables $X_i$, $i \in \mathbb{N}$, is an expression of the form

(1)  $X_{k_1}^{\sigma_1} X_{k_2}^{\sigma_2} \cdots X_{k_n}^{\sigma_n} = b$,

where $k_1, \dots, k_n \in \mathbb{N}$, $\sigma_1, \dots, \sigma_n \in \{1, -1\}$, and $b$ is given. A parametric
equation is one obtained from a pure equation by substituting some
of the variables with given (known) parameters. By equation we mean
either a pure or a parametric one. Since any probabilistic method to
solve a system of equations implies a probabilistic means to check that
a given system has a solution, we will confine attention to systems of
equations which possess a solution.

Given a system of equations of the form (1), it is often possible
to use algebraic manipulations (taking inverses and multiplications of
equations) in order to derive from it a (possibly smaller) system of
equations all of which share the same leading variable, that is, such
that all equations have the form

(2)  $X W_i = b_i$,

where $X$ is one of the variables appearing in the original system. The
task is to find the leading variable $X$ in the system (2). Having achieved
this, the process can be iterated to recover all variables appearing in the
original system (1). In the sequel we confine our attention to systems
consisting of one or more equations of the form (2).



1.2. Solving equations in a finitely generated group. The following
general scheme is an extension of one suggested by Hughes and
Tannenbaum [B] and examined in [2]. Our new scheme turns out to be
dramatically more successful (compare the results of Section 2 to those
in [2]).

It is convenient to think of each of the variables as an unknown
element of the group $G$. Assume that the group $G$ is generated by
the elements $a_1, \dots, a_m$ and that there exists a "reasonable" length
function $\ell : G \to \mathbb{R}^+$, that is, such that the expected length tends to
increase with the number of multiplied generators.

Assume that equations of the form (2), $i = 1, \dots, k$, are given. We
propose the following algorithm. Since $X \in G$, it has a (shortest) form

$X = a_{j_1}^{\sigma_1} a_{j_2}^{\sigma_2} \cdots a_{j_n}^{\sigma_n}$.

The algorithm generates an ordered list of $M$ sequences of length $n$,
such that with a significant probability, the sequence

$((j_1, \sigma_1), (j_2, \sigma_2), \dots, (j_n, \sigma_n))$

(which codes $X$) appears in the list, and tends to be its first member.
The algorithm works with memory close to $M \cdot n$, thus $M$ is usually
chosen according to the memory limitations of the computer (see also
Remark 1.4).

Step 1: For each $j = 1, \dots, m$ and $\sigma \in \{1, -1\}$, compute
    $a_j^{-\sigma} b_i = a_j^{-\sigma} X W_i$ for each $i = 1, \dots, k$, and give $(j, \sigma)$ the
    score $\sum_{i=1}^{k} \ell(a_j^{-\sigma} b_i)$. Keep in memory the $M$ elements
    $(j, \sigma)$ with the least scores.

Step s > 1: For each sequence $((j_1, \sigma_1), \dots, (j_{s-1}, \sigma_{s-1}))$ out of
    the $M$ sequences stored in the memory, each $j_s = 1, \dots, m$ and each
    $\sigma_s \in \{1, -1\}$, compute the sum of the lengths of the elements

    $a_{j_s}^{-\sigma_s} \cdots a_{j_1}^{-\sigma_1} b_i = a_{j_s}^{-\sigma_s} \cdots a_{j_1}^{-\sigma_1} X W_i$

    over $i = 1, \dots, k$, and assign the resulting score to the sequence
    $((j_1, \sigma_1), \dots, (j_s, \sigma_s))$. Keep in memory only the $M$ sequences
    with the least scores.
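
The two steps above amount to a beam search of width M. The following Python sketch is an illustration only, not the implementation used in the experiments. As an assumption made to keep it self-contained, it replaces the braid group by a free group whose elements are reduced words (tuples of nonzero integers, with x and -x inverse letters) and uses the reduced word length as the length function. The names reduce_word, mul, inv and length_based_search are hypothetical, generator indices are 0-based, and the partially peeled elements are stored next to each sequence.

```python
def reduce_word(word):
    """Freely reduce a word given as a sequence of nonzero ints (x and -x cancel)."""
    out = []
    for x in word:
        if out and out[-1] == -x:
            out.pop()
        else:
            out.append(x)
    return tuple(out)


def mul(u, v):
    """Product of two reduced words in the toy free group."""
    return reduce_word(tuple(u) + tuple(v))


def inv(u):
    """Inverse of a word."""
    return tuple(-x for x in reversed(u))


def length_based_search(gens, bs, n, M, length=len):
    """Return up to M pairs (sequence, score), least score first.

    gens: the generators a_j (0-indexed); bs: the given elements b_i = X W_i;
    n: number of steps; M: number of sequences kept in memory.
    """
    beam = [((), tuple(bs))]
    scored = []
    for _ in range(n):
        scored = []
        for seq, residuals in beam:
            for j, a in enumerate(gens):
                for sigma in (1, -1):
                    peel = inv(a) if sigma == 1 else a  # this is a_j^{-sigma}
                    new_res = tuple(mul(peel, r) for r in residuals)
                    score = sum(length(r) for r in new_res)
                    scored.append((score, seq + ((j, sigma),), new_res))
        scored.sort(key=lambda c: c[0])  # stable: ties keep generation order
        scored = scored[:M]
        beam = [(seq, res) for _, seq, res in scored]
    return [(seq, score) for score, seq, _ in scored]


# Toy usage: X = a_0 a_1 a_0^{-1} a_1 in the free group on a_0 = (1,), a_1 = (2,).
gens = [(1,), (2,)]
X = (1, 2, -1, 2)
bs = [mul(X, (2, 2)), mul(X, (1, 1))]
print(length_based_search(gens, bs, n=4, M=2)[0])
```

In this toy setting the generators are free, so the length function is perfectly monotone and the correct sequence is ranked first; in the braid group this holds only with some probability.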

We still must describe the halting condition for the algorithm. If it is 
known that X can be written as a product of at most n generators, 
then the algorithm terminates after step n. Otherwise, the halting 
decision is more complicated. In the most general case we can decide 
to stop the process when the sum of the M scores increases rather than 
decreases. However, in many specific cases the halting decision can be 
made much more effective - see the examples below. 
We describe several applications of the algorithm. 

Example 1.1 (Parametric equations). If some of the words $W_i$ in the
equations (2) begin with a known parameter $P_i$, then the heuristic
decision when to stop can be made much more effective: If at some
step $X$ was completely peeled off the equation, then we know the words
$W_i$. To test this, for each of the $M$ suggestions for $X$, we calculate
the words $W_i$ and check whether the sum of the lengths $\ell(P_i^{-1} W_i)$ is
significantly smaller than that of the lengths $\ell(W_i)$. In fact, this allows
us to determine, with significant probability, which of the $M$ candidates
for $X$ is the correct one.

Example 1.2 (The Conjugacy Problem and its variants). The approach
in Example 1.1 can also be applied in the case that the system of
equations (2) consists of a single equation. This is the case, e.g., in the
parametric conjugacy problem, where $X P X^{-1}$ and $P$ are given, and we
wish to find $X$. Note that in this case the algorithm can be modified to
become much more successful if at each step $s$ we peel off the generator
$a_{j_s}^{\sigma_s}$ from both sides of the element (more precisely, we peel off
$a_{j_s}^{\sigma_s}$ from the left and $a_{j_s}^{-\sigma_s}$ from the right).

Observe, though, that if n is known in advance (as in many applica- 
tions, e.g., [HE]), then in principle the original algorithm works, which 
means that we can solve the conjugacy problem even if we do not know 
the conjugated element P. 
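
The two-sided peeling of Example 1.2 can be sketched in the same way. The code below is a hedged illustration, again using a free group with reduced word length as a stand-in for the braid group (an assumption); conjugacy_search and the helper names are hypothetical. Each step replaces the current element c by $a^{-\sigma} c\, a^{\sigma}$ and keeps the M shortest results.

```python
def reduce_word(word):
    """Freely reduce a word given as a sequence of nonzero ints (x and -x cancel)."""
    out = []
    for x in word:
        if out and out[-1] == -x:
            out.pop()
        else:
            out.append(x)
    return tuple(out)


def mul(*words):
    """Product of any number of words in the toy free group."""
    return reduce_word([x for w in words for x in w])


def inv(u):
    return tuple(-x for x in reversed(u))


def conjugacy_search(gens, c, n, M):
    """Beam search for X with c = X P X^{-1}.

    Returns up to M pairs (sequence, residual), shortest residual first, where
    residual = X'^{-1} c X' for the element X' coded by the sequence.
    """
    beam = [((), tuple(c))]
    for _ in range(n):
        scored = []
        for seq, res in beam:
            for j, a in enumerate(gens):
                for sigma in (1, -1):
                    g = a if sigma == 1 else inv(a)   # a_j^{sigma}
                    new_res = mul(inv(g), res, g)      # peel from both sides
                    scored.append((len(new_res), seq + ((j, sigma),), new_res))
        scored.sort(key=lambda t: t[0])
        beam = [(seq, res) for _, seq, res in scored[:M]]
    return beam


# Toy usage: c = X P X^{-1} with X and P chosen by hand.
gens = [(1,), (2,)]
X, P = (1, 2, -1, 2), (1, 1, 2)
c = mul(X, P, inv(X))
candidates = conjugacy_search(gens, c, n=4, M=4)
# With P known, the right candidate is the one whose residual equals P.
print([seq for seq, res in candidates if res == P])
```

In this toy run two candidates tie for the least score, since cyclic rotations of P have the same length; comparing the residual with the known P is what singles out X, in the spirit of Example 1.1.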

Example 1.3 (Group Membership and Shortest Presentation problems).
Assume that $G$ is a finitely generated subgroup of some larger group
$L$. Given $g \in L$, we wish to decide whether $g \in G$. In this case we
simply run our algorithm on $g$ using the generators of $G$, and after
each step check whether $g$ is coded by one of our $M$ sequences. This
also provides (probabilistically) a way to write an element $g \in G$ as
a product of the generators of $G$, and with a significant probability it
will be the shortest such product.
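
A minimal sketch of the membership test of Example 1.3, under the same toy assumption (a free group with reduced word length in place of the braid group; all function names are hypothetical). After each peeling step it checks whether some candidate has been reduced to the identity, in which case the candidate sequence codes g. A None result only means that no presentation was found within the given budget.

```python
def reduce_word(word):
    """Freely reduce a word given as a sequence of nonzero ints (x and -x cancel)."""
    out = []
    for x in word:
        if out and out[-1] == -x:
            out.pop()
        else:
            out.append(x)
    return tuple(out)


def mul(*words):
    return reduce_word([x for w in words for x in w])


def inv(u):
    return tuple(-x for x in reversed(u))


def membership_search(gens, g, max_steps, M):
    """Return a sequence of (index, sign) pairs whose product is g, or None."""
    g = reduce_word(g)
    if g == ():
        return ()
    beam = [((), g)]
    for _ in range(max_steps):
        scored = []
        for seq, res in beam:
            for j, a in enumerate(gens):
                for sigma in (1, -1):
                    peel = inv(a) if sigma == 1 else a  # a_j^{-sigma}
                    new_res = mul(peel, res)
                    new_seq = seq + ((j, sigma),)
                    if new_res == ():                   # g is coded by new_seq
                        return new_seq
                    scored.append((len(new_res), new_seq, new_res))
        scored.sort(key=lambda t: t[0])
        beam = [(seq, res) for _, seq, res in scored[:M]]
    return None


# Toy usage: the subgroup of the free group generated by a_0 = ab and a_1 = bba.
gens = [(1, 2), (2, 2, 1)]
g = mul(gens[0], inv(gens[1]), gens[0])
print(membership_search(gens, g, max_steps=5, M=4))
```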

Remark 1.4 (Complexity). Note that the parameter $M$ determining the
length of the final list also affects the running time of the algorithm.
As stated, if it runs $n$ steps then it performs about

$\sum_{s=1}^{n} kM(s + 2m) = n(n + 4m + 1)kM/2$

group multiplications and $2kmnM$ evaluations of the length function $\ell$.
(Recall that $m$ denotes the number of generators of the group, and
$k$ denotes the number of equations.) The running time can be improved
at the cost of additional memory (e.g., one can keep in memory the $M$
elements of the form $a_{j_{s-1}}^{-\sigma_{s-1}} \cdots a_{j_1}^{-\sigma_1} b_i$, which were computed at step
$s - 1$, to reduce the number of multiplications in step $s$). Note further
that the algorithm is completely parallelizable.
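
The closed form in Remark 1.4 can be checked numerically. The short sketch below (function names are hypothetical) evaluates the sum term by term and compares it with the closed form; it only restates the counts given in the remark.

```python
def multiplications_by_sum(n, m, k, M):
    """Sum over steps s = 1..n of k*M*(s + 2m), as in Remark 1.4."""
    return sum(k * M * (s + 2 * m) for s in range(1, n + 1))


def multiplications_closed_form(n, m, k, M):
    """n(n + 4m + 1)kM/2; the product n(n + 4m + 1) is always even."""
    return n * (n + 4 * m + 1) * k * M // 2


def length_evaluations(n, m, k, M):
    """2kmnM evaluations of the length function."""
    return 2 * k * m * n * M


# Largest parameters list of the main experiment: m = 8, n = 64, k = 8, M = 512.
print(multiplications_by_sum(64, 8, 8, 512), length_evaluations(64, 8, 8, 512))
```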

In the next section we give experimental evidence for this algorithm's 
ability to solve, with surprisingly significant probability, arbitrary equa- 
tions in "random" finitely generated subgroups of the braid group 
with nontrivial parameters. 

2. Experimental results in the braid group 

In the following definition (only), we assume that the reader has
some familiarity with the braid group and its algorithms. Some
references for these are [3, 7] and references therein.

The Garside normal form of an element $w$ in the braid group $B_N$ is
a unique presentation of $w$ in the form $\Delta_N^{-r} \cdot p_1 \cdots p_m$, where $r \ge 0$ is
minimal and $p_1, \dots, p_m$ are permutation braids in left canonical form.
The following length function was introduced in [2], where it was shown
that it exhibits much better properties than the usual length function
associated with the Garside normal form.

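Definition 2.1 ([2]). Let $w = \Delta_N^{-r} \cdot p_1 \cdots p_m$ be the Garside normal
form of $w$. The Reduced Garside length of $w$ is defined by

[Display formula not recoverable from the source; its two sums range over
$j = \min\{r, m\} + 1, \dots, m$ and $i = 1, \dots, \min\{r, m\}$. See [2].]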

Our major experiment was made in subgroups of $B_N$ with $N = 8$,
which is large enough so that $B_N$ is not trivial, but small enough
that we could perform a very large number of experiments. The finitely
generated subgroups in which we worked were random in the sense that
each generator was chosen as a product of 10 randomly² chosen Artin
generators.³ In this experiment we checked the effectiveness of our
algorithm for the parameters list (m, n, k, l, M), where:

(1) m (the number of generators of the subgroup) was 2, 4, or 8, 

(2) n (the number of generators multiplied to obtain X) was 16, 
32, or 64, 

(3) k (the number of given equations of the form (2)) was 1, 2, 4,
or 8,

(4) l (the number of generators multiplied to obtain the words $W_i$
in the equations (2)) was 4 or 8; and

(5) M (the available memory) was 2, 4, 8, 16, 32, 64, 128, 256, or 512

(see Section 1.2). This makes a total of 3 · 3 · 4 · 2 · 9 = 648 parameter
lists, for each of which we repeated the experiment about 16 times.
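
The experimental setup can be enumerated as follows. This is a sketch under stated assumptions: a word is represented as a list of signed integers, with i standing for the i-th Artin generator of $B_N$ and -i for its inverse, and the function names are hypothetical. The values of m, n, k, l, M and the number 10 of Artin generators per subgroup generator are taken from the text.

```python
import random
from itertools import product


def parameter_lists():
    """All parameters lists (m, n, k, l, M) used in the main experiment."""
    ms = [2, 4, 8]
    ns = [16, 32, 64]
    ks = [1, 2, 4, 8]
    ls = [4, 8]
    Ms = [2 ** e for e in range(1, 10)]  # 2, 4, ..., 512
    return list(product(ms, ns, ks, ls, Ms))


def random_generator_word(N, length=10, rng=random):
    """A product of `length` uniformly chosen Artin generators or inverses in B_N.

    B_N has N - 1 Artin generators; letter i means the i-th generator, -i its inverse.
    """
    letters = [sign * i for i in range(1, N) for sign in (1, -1)]
    return [rng.choice(letters) for _ in range(length)]


def random_subgroup_generators(m, N, rng=random):
    """m subgroup generators, each a product of 10 random Artin generators."""
    return [random_generator_word(N, 10, rng) for _ in range(m)]


print(len(parameter_lists()))
```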

X tends to be first. In about 83% of these experiments, X was a member
of the resulting list of M candidates. A natural concern arises when
we increase M: experiments show that the probability of X
appearing in the resulting list becomes larger,⁴ but now we have more
candidates for X, which is undesirable when we cannot check which member
of the list is X. However, it turns out that even for large values
of M, X tends to be among the first few in the list. In 71% of our
experiments, X was actually the first in the list, and when M = 512,
the probability that X ends in position i = 1, 2, 3, ... decreases
with i, and the first few probabilities are: 0.83, 0.08, 0.03, and 0.01.

Group membership is often solved correctly. The experiments corresponding
to the group membership problem are those with k = 1: In
these cases we are given a single element XW and find a presentation
of X using the given generators; this generalizes the case that we are
given X and find its presentation, when it is possible (see Example 1.3).
Checking the experiments with k = 1, m = 4 or 8, and M = 512, we
get a success ratio of 0.98.

² In this section, random always means with respect to the uniform distribution
on the space in question. However, we believe that good results would be obtained
for any nontrivial distribution.

³ In this section, generator means a generator or its inverse.

⁴ At first glance this seems a triviality, but observe that when M is increased,
the correct answer has more competitors.

Logistic regression. In order to describe the dependence of the success
ratio on the parameters involved, we apply the methods of logistic
regression. Let $x_1, \dots, x_5$ denote the logarithms to base 2 of the
parameters m, n, k, l, M, respectively. Since the probability of success
p in each case is a number between 0 and 1, a standard linear model
(expressing p as a linear combination of the variables $x_j$) is not suitable.
Instead, it is customary to express the function $L = \log(p/(1 - p))$ as
such a linear combination of the variables $x_j$ (so that $p = e^L/(1 + e^L)$).
This is called the logistic model. Note that under this transformation
the derivative of p with respect to L is $p(1 - p)$, so an addition of
$\Delta L$ to L will increase p to approximately $p + p(1 - p)\Delta L$. The best
approximation in this model is

(3)  $L \approx 7.0814 - 1.7165 x_1 - 0.7547 x_2 + 0.1094 x_3 + 0.5437 x_5$.

The quality of the approximation is measured by the variance of the
error. Since we are taking the best linear approximation, adding any
variable (even a random independent one) reduces the variance of the
error. The significance level of a variable $x_j$ roughly measures the
probability that adding this variable to the others would reduce the
variance as much as it does, were the variable random. The typical
threshold is 0.05: A significance level of 0.05 or below means that the
variable has a significant contribution to the approximation of L, which
could not be attained by a variable independent of L. In the
approximation (3), all variables have significance level < 0.0003, except
for the variable $x_4$ (corresponding to l) which has significance level
0.096, and is therefore not taken into consideration in the
approximation (3).

We have verified that Approximation (3) gives a fairly good estimation
of the success probabilities for the tried parameters.

Doubling the memory. Figure 1 shows the effect of doubling M on the
success probability, according to Approximation (3). To create this
figure, we fixed m = 8 and k = 1, and for each $M = 2^1, 2^2, \dots, 2^{10}$
we have drawn the graph of the success probability p with respect to
$\log_2(n)$.

Remark 2.2. According to Approximation (3), in order to maintain
the success probability when m is doubled, M should be multiplied by

$2^{1.7165/0.5437} \approx 8.92$.

Another interpretation is as follows. Assume that we wish to decide
what the value of M should be to get success probability 0.5, that is,
L = 0. From (3) it follows that

$x_5 \approx (-7.0814 + 1.7165 x_1 + 0.7547 x_2 - 0.1094 x_3)/0.5437$

and therefore

$M = 2^{x_5} \approx 0.00012 \cdot m^{3.16} \cdot n^{1.39} / k^{0.2}$.

It seems that the prediction capabilities of Approximation (3) for
larger parameters are not bad.

Figure 1. The effect of doubling M on the success probability
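
The constants in Remark 2.2 follow directly from the coefficients of the logistic approximation. The sketch below recomputes them; the dictionary COEF and the function names are hypothetical, and the coefficient values are those of the approximation for L stated in the text.

```python
import math

# Coefficients of L as a linear function of log2(m), log2(n), log2(k), log2(M).
COEF = {"const": 7.0814, "m": -1.7165, "n": -0.7547, "k": 0.1094, "M": 0.5437}


def memory_factor_when_m_doubles():
    """Factor by which M must grow to keep L unchanged when m is doubled."""
    return 2 ** (-COEF["m"] / COEF["M"])


def memory_for_even_odds(m, n, k):
    """Value of M (not rounded) for which the model gives L = 0, i.e. p = 0.5."""
    x5 = -(COEF["const"]
           + COEF["m"] * math.log2(m)
           + COEF["n"] * math.log2(n)
           + COEF["k"] * math.log2(k)) / COEF["M"]
    return 2 ** x5


def power_law_constants():
    """Constant and exponents of M = c * m^a * n^b / k^d at L = 0."""
    c = 2 ** (-COEF["const"] / COEF["M"])
    a = -COEF["m"] / COEF["M"]
    b = -COEF["n"] / COEF["M"]
    d = COEF["k"] / COEF["M"]
    return c, a, b, d


print(round(memory_factor_when_m_doubles(), 2))
print([round(v, 5) for v in power_law_constants()])
```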

Example 2.3. Using Approximation (3), the predicted success probability
for parameters list (16, 128, 8, 8, 1024) is 0.668. An experiment
for these parameters succeeded in 9 out of 11 tries (about 0.82).
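
The prediction in Example 2.3 can be reproduced by evaluating the logistic model directly. In this sketch predicted_success is a hypothetical name; the parameter l is omitted because the corresponding variable is not part of the approximation.

```python
import math


def predicted_success(m, n, k, M):
    """Success probability p = e^L / (1 + e^L) under the logistic approximation."""
    L = (7.0814
         - 1.7165 * math.log2(m)
         - 0.7547 * math.log2(n)
         + 0.1094 * math.log2(k)
         + 0.5437 * math.log2(M))
    return 1.0 / (1.0 + math.exp(-L))


# Parameters list (16, 128, 8, 8, 1024) of Example 2.3 (l = 8 is not used).
print(round(predicted_success(16, 128, 8, 1024), 3))  # 0.668
```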

2.1. Identifying failures. Figure 2 describes the position of the correct
prefix of X and the average score of all M sequences in the memory
during the steps of the algorithm (the graphs are normalized for graphical
clarity). Two typical examples are given, both for parameters list
(2, 64, 8, 8, 128). An interesting observation is that when the correct
prefix is not among the first few, the average length decreases more
slowly with the steps of the algorithm.

It turns out that in most of the cases where the correct prefix of X
does not survive a certain step (that is, it is not ranked among the first
M sequences), the average length after several more steps almost does
not decrease. Figure 3 illustrates two typical cases, with parameters
list (2, 64, 8, 8, 16) (left) and (2, 64, 8, 8, 8) (right).

This allows us to identify failures within several steps after their oc- 
currence. In such cases one approach is to return a few steps backwards, 
increase M for the next (problematic) few steps, and then decrease it 
again. 
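
One possible way to turn this observation into a test is to watch the average score over a sliding window of steps. The sketch below is only an illustration: the function name first_stall, the window size and the threshold min_drop are hypothetical choices, not values taken from the experiments.

```python
def first_stall(avg_scores, window=3, min_drop=0.0):
    """Return the first step index s at which the average score has dropped by
    at most min_drop over the preceding `window` steps, or None if it never stalls.

    avg_scores[s] is the average score of the M stored sequences after step s.
    """
    for s in range(window, len(avg_scores)):
        if avg_scores[s - window] - avg_scores[s] <= min_drop:
            return s
    return None


# A run whose average score stops decreasing is flagged a few steps later.
print(first_stall([100, 90, 80, 70, 70.5, 71, 70.2]))
```

A caller could then return `window` steps backwards and rerun those steps with a larger M, as suggested above.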

We must stress that these are only typical cases, and several pathological
cases (where the correlation between the decrease in the lengths
and the position of the correct prefix was not as expected) were also
encountered. In these rare cases, we observed at least one of the following
phenomena: Either the generators $a_i$ could be written as a product of
very few Artin generators, due to several cancellations in the product
defining them (recall that each generator $a_i$ is a product of 10 random
Artin generators in $B_8$), or else some (but not all) of the Artin generators
multiplied to obtain $a_i$ were cancelled when multiplied with some
of the Artin generators defining $a_j$ (or its inverse), so that the resulting
element could be written using far fewer Artin generators than expected.
This violates the required monotonicity of the length function
and makes the algorithm fail.

Figure 2. Position of the correct prefix in successful runs

Figure 3. Position of the correct prefix in unsuccessful runs

2.2. Working in $B_N$ when N is larger. For the parameters lists
(2, 16, 8, 8, 2) and (8, 16, 8, 8, 128), we have checked the success
probabilities for N = 8, 10, 12, 14, 16, 20, 24, 28, 32, 36, 40, 50, 60, 70,
80, 96, and 100. The results are shown in Figure 4. While the success
probability decreases with N, it does not become as negligible as one
might expect. Moreover, it can be significantly enlarged at the cost of
increasing M.




3. Concluding remarks 

Our results suggest that whenever G is a finitely generated subgroup 
of the braid group, which is obtained by a sufficiently "random" pro- 
cess, and the involved parameters are feasible for handling the group 
elements in the computer, it is possible to solve equations in the given 
group with significant success probabilities. This significantly extends 
similar results concerning the conjugacy problem (with known param- 
eters) obtained in other works (e.g., [S]). 

This approach seems to imply the vulnerability of the key exchange
protocols suggested in [1, 7], since their security is based on the
difficulty of the Conjugacy Problem in "random" subgroups of the braid
group (see Example 1.2). It should be stressed that our experiments
were performed with a small amount of memory (parameter M), which
could, in feasible settings, be increased by several orders of magnitude
and therefore significantly improve the success probability. Since even
a small non-negligible success probability in attacking the protocol
implies that it is not secure, it seems that in order to immunize the current
protocols against the attack implied by the results here, the working
parameters have to be increased so much that the system will become
impractical.

However, in order to use our approach against newly proposed pro- 
tocols based on the braid group (see [1]), or against similar protocols 
based on other finitely generated groups, one must first find a good 
length function for the specific problem. 